
fix: use block_number/block_offset to uniquely identify log rows#2071

Merged
kodiakhq[bot] merged 2 commits into main from karl/id-field-in-log-filter
May 4, 2026

Conversation

@karl-power
Contributor

@karl-power karl-power commented Apr 8, 2026

Summary

Fixes an issue where clicking a log row to expand it could show the wrong row's details when multiple rows share identical visible column values (Timestamp, ServiceName, Body, etc.).

Root cause: The detail panel query reconstructs a WHERE clause from only the visible columns, without the original search filters. When rows collide on those columns, LIMIT 1 returns whichever row ClickHouse finds first — not necessarily the one clicked.

Fix: Detect and inject ClickHouse's built-in _block_number and _block_offset virtual columns as hidden tiebreakers in the row SELECT. These are available when the table engine was created with enable_block_number_column = 1 and enable_block_offset_column = 1. Since the row WHERE clause builder already iterates all columns in the row data, the block columns are automatically included in detail queries — disambiguating rows without surfacing them in the UI.

A fallback to __hdx_id (a custom materialized column from the prior approach) is retained for any tables that already have that column.

No schema migration required for the primary fix — the approach relies entirely on engine-level settings that already exist on the otel_logs table.

Key changes

  • DBRowTable.tsx: Rename appendSelectWithPrimaryAndPartitionKey to appendSelectWithAdditionalKeys, adding an extraKeys: string[] param. Rename useConfigWithPrimaryAndPartitionKey to useConfigWithAdditionalSelect; when a sourceId is present (row-level query), inspect engine_full metadata for block column flags and inject _block_number/_block_offset as extra keys, falling back to __hdx_id if present.
  • types.ts / source.ts / SourceForm.tsx: Remove uniqueRowIdExpression — the tiebreaker is now auto-detected from the table engine, not manually configured.
  • Schema seed: Remove the __hdx_id materialized column addition from otel_logs; the block column engine flags are sufficient.
  • Tests: New appendSelectWithAdditionalKeys cases covering extraKeys append, deduplication against primary/partition keys, and array-style selects. Scaffold for inferTableSourceConfig tests.
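The detection-and-injection flow described above can be sketched as follows. This is a rough outline, not the PR's actual DBRowTable.tsx code: hasBlockColumns and pickTiebreakerKeys are illustrative helper names, and the real appendSelectWithAdditionalKeys also deduplicates against primary/partition keys.

```typescript
// Sketch of the auto-detection flow (illustrative, not the PR's code).

// True when the table's engine_full string shows both block-column
// settings enabled. (A spacing-tolerant regex check was also discussed
// in review.)
function hasBlockColumns(engineFull: string): boolean {
  return (
    engineFull.includes('enable_block_number_column = 1') &&
    engineFull.includes('enable_block_offset_column = 1')
  );
}

// Choose hidden tiebreaker columns for the row-level SELECT: prefer the
// virtual block columns, fall back to __hdx_id if the table has it.
function pickTiebreakerKeys(
  engineFull: string,
  tableColumns: string[],
): string[] {
  if (hasBlockColumns(engineFull)) {
    return ['_block_number', '_block_offset'];
  }
  if (tableColumns.includes('__hdx_id')) {
    return ['__hdx_id'];
  }
  return []; // no tiebreaker available
}

// Append extra keys to a comma-separated SELECT list, skipping any that
// are already present.
function appendSelectWithAdditionalKeys(
  select: string,
  extraKeys: string[],
): string {
  const existing = new Set(select.split(',').map(s => s.trim()));
  const added = extraKeys.filter(k => !existing.has(k));
  return added.length ? `${select}, ${added.join(', ')}` : select;
}
```

Because the row WHERE clause builder iterates all columns in the row data, any key appended here flows into the detail query automatically.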

How to test locally

Reproduce the bug on main

  1. Pick any log row from otel_logs. In the ClickHouse UI, run the query that fires when you click a row to open its sidebar.
  2. Insert the same row into otel_logs but change one nested value — e.g. change ResourceAttributes.host.arch from arm64 to arm65.
  3. On the search page, filter for ResourceAttributes.host.arch = arm65. One result appears. Click it to open the sidebar.
  4. The sidebar shows arm64 — it selected the wrong row.
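Step 2 can be done with an INSERT ... SELECT that copies the row and rewrites one map value. A rough sketch, assuming the default OTel schema (ResourceAttributes as Map(String, String)); the WHERE is a placeholder you should adapt to the row picked in step 1:

```sql
-- Hypothetical repro helper: copy one existing row, rewriting a single
-- ResourceAttributes value, so both rows collide on every visible column.
INSERT INTO default.otel_logs
SELECT * REPLACE (
    mapUpdate(ResourceAttributes, map('host.arch', 'arm65')) AS ResourceAttributes
)
FROM default.otel_logs
WHERE ResourceAttributes['host.arch'] = 'arm64'  -- narrow to your chosen row
LIMIT 1;
```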

Verify the fix on this branch

  1. Confirm otel_logs was created with enable_block_number_column = 1 and enable_block_offset_column = 1 (check via SHOW CREATE TABLE default.otel_logs).
  2. Repeat steps 1–3 above.
  3. The sidebar now shows arm65 — correct row selected.
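The step-1 check, and a way to enable the flags if they are missing (sketch; note that rows written before the change may not have persisted block values until their parts are rewritten, an assumption worth verifying on your ClickHouse version):

```sql
-- Check the engine settings (look for both flags in the SETTINGS clause):
SHOW CREATE TABLE default.otel_logs;

-- If missing, the MergeTree settings can be changed in place:
ALTER TABLE default.otel_logs
    MODIFY SETTING enable_block_number_column = 1, enable_block_offset_column = 1;
```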

Verify fallback path

  1. On a table that has __hdx_id but no block columns, the app still injects __hdx_id as the tiebreaker (same behavior as the previous approach).
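With either tiebreaker in place, the detail-panel query ends up shaped roughly like this. Illustrative only: the parameter names are placeholders, not the PR's generated SQL.

```sql
-- The hidden tiebreakers ride along in the reconstructed WHERE clause,
-- so LIMIT 1 can only match the physical row that was clicked.
SELECT *
FROM default.otel_logs
WHERE Timestamp = {ts:DateTime64(9)}
  AND ServiceName = {svc:String}
  AND Body = {body:String}
  AND _block_number = {bn:UInt64}
  AND _block_offset = {bo:UInt64}
LIMIT 1;
```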

References

@changeset-bot

changeset-bot Bot commented Apr 8, 2026

🦋 Changeset detected

Latest commit: cef75d2

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
Name Type
@hyperdx/api Patch
@hyperdx/app Patch
@hyperdx/otel-collector Patch


@vercel

vercel Bot commented Apr 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: hyperdx-oss | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): May 4, 2026 1:00pm


@github-actions
Contributor

github-actions Bot commented Apr 8, 2026

E2E Test Results

All tests passed • 159 passed • 3 skipped • 1180s

Status Count
✅ Passed 159
❌ Failed 0
⚠️ Flaky 4
⏭️ Skipped 3

Tests ran across 4 shards in parallel.

View full report →

@karl-power karl-power force-pushed the karl/id-field-in-log-filter branch from a7ee44f to 3151b4d Compare April 9, 2026 09:00
@karl-power karl-power marked this pull request as ready for review April 9, 2026 13:53
@github-actions github-actions Bot added the review/tier-4 Critical — deep review + domain expert sign-off label Apr 9, 2026
@github-actions
Contributor

github-actions Bot commented Apr 9, 2026

🔴 Tier 4 — Critical

Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.

Why this tier:

  • Critical-path files (1):
    • docker/otel-collector/schema/seed/00002_otel_logs.sql
  • Cross-layer change: touches frontend (packages/app) + backend (packages/api) + shared utils (packages/common-utils)

Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
SLA: Schedule synchronous review within 2 business days.

Stats
  • Production files changed: 6
  • Production lines changed: 81 (+ 73 in test files, excluded from tier calculation)
  • Branch: karl/id-field-in-log-filter
  • Author: karl-power

To override this classification, remove the review/tier-4 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

@karl-power karl-power requested a review from wrn14897 April 9, 2026 13:54
@github-actions
Contributor

github-actions Bot commented Apr 9, 2026

PR Review

PR #2071 — fix: use block_number/block_offset to uniquely identify log rows

Solid fix for a real bug. Logic is correct, tests cover the new cases, and the fallback chain is sensible. A few things to check:

  • ⚠️ Fragile engine string matching → engineFull.includes('enable_block_number_column = 1') breaks silently if ClickHouse ever serializes settings without spaces (e.g. enable_block_number_column=1). Consider a regex like /enable_block_number_column\s*=\s*1/ to be spacing-tolerant.

  • ⚠️ Existing deployments without the engine flags → The seed SQL change only affects new otel_logs tables. Any instance that created otel_logs before this commit won't have the block columns and will silently fall back to __hdx_id (or no tiebreaker at all). The PR description says these settings "already exist" on existing tables — if that's true for all supported upgrade paths, fine; otherwise a migration note or runtime warning would help.

  • ℹ️ additionalKeys length when extraKeys overlap primaryKeys → The deduplication via Set is correct, and the new test "should deduplicate extraKeys that overlap with primary/partition keys" confirms it. No action needed, just confirming it's covered.

  • ✅ Removal of uniqueRowIdExpression from Mongoose model and Zod schema is clean — field was already UI-disabled and the auto-detection approach supersedes it.

  • ✅ useColumns conditional fetch (enabled: !!sourceId) follows standard TanStack Query patterns and correctly skips aggregate (patterns) queries.

  • ✅ New tests cover extraKeys append, deduplication, and array-style selects.
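The spacing-tolerant check suggested in the first point could look like this (a sketch; hasBlockColumns is an illustrative name, not the PR's actual helper):

```typescript
// Regex-based detection tolerates both serializations:
//   "enable_block_number_column = 1"  and  "enable_block_number_column=1"
const BLOCK_NUMBER_RE = /enable_block_number_column\s*=\s*1/;
const BLOCK_OFFSET_RE = /enable_block_offset_column\s*=\s*1/;

function hasBlockColumns(engineFull: string): boolean {
  return BLOCK_NUMBER_RE.test(engineFull) && BLOCK_OFFSET_RE.test(engineFull);
}
```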

@karl-power
Contributor Author

@wrn14897 Do you know if the migrations are needed here? I tried to run them locally which invokes migrate, but I don't see that installed anywhere.

Member

@wrn14897 wrn14897 left a comment


I think we should use the UUID type for the id column (https://clickhouse.com/docs/sql-reference/data-types/uuid). Using toUInt16(rand()) only provides 65,536 possible values, which could lead to collisions.

I’d also suggest raising this with the team, since it involves a schema change. It would be good to make sure we all align on the approach for the ID field before moving forward. cc @knudtty

@karl-power
Contributor Author

karl-power commented Apr 13, 2026

I think we should use the UUID type for the id column

@MikeShi42 do you have thoughts on this as you originally suggested UInt16?

I'm guessing the reasoning was that UInt16 combined with the other fields from the search query would be enough to reduce the chance of a collision, with the benefit of less storage than UUID

@wrn14897
Member

I think we should use the UUID type for the id column

@MikeShi42 do you have thoughts on this as you originally suggested UInt16?

I'm guessing the reasoning was that UInt16 combined with the other fields from the search query would be enough to reduce the chance of a collision, with the benefit of less storage than UUID

I understand the reason for reducing storage (to achieve a better compression ratio), and that this ID is only used internally. Also, as you mentioned, the chance of collisions would be lower if the search is always combined with other fields. I just want to confirm that this is intentional

@karl-power
Contributor Author

@wrn14897 I have shared with the team but didn't receive any feedback yet. Are you happy to proceed like this or prefer to use uuid?

@MikeShi42
Contributor

@karl-power instead of creating a new column, any thoughts on if we should instead just use _block_number and _block_offset and enabling enable_block_number_column and enable_block_offset_column by default in new tables? This will technically allow other features as well within ClickHouse - so more useful than a random id.

We can tell existing users to run an update to add this new random id column (as suggested in this PR) if they haven't set up their table to use block number/offsets.

@karl-power
Contributor Author

karl-power commented Apr 28, 2026

@karl-power instead of creating a new column, any thoughts on if we should instead just use _block_number and _block_offset and enabling enable_block_number_column and enable_block_offset_column by default in new tables? This will technically allow other features as well within ClickHouse - so more useful than a random id.

We can tell existing users to run an update to add this new random id column (as suggested in this PR) if they haven't set up their table to use block number/offsets.

@MikeShi42 Yeah, I think the block_number/offset combo would work. So you're suggesting to remove the UI control for the unique identifier and fall back to __hdx_id if the block settings aren't enabled?

@karl-power karl-power force-pushed the karl/id-field-in-log-filter branch from d790f62 to 327e41e Compare May 1, 2026 00:50
@karl-power karl-power changed the title fix: filter logs with ID field fix: use block_number/block_offset to uniquely identify log rows May 1, 2026
@karl-power karl-power force-pushed the karl/id-field-in-log-filter branch from 327e41e to e44d45a Compare May 1, 2026 01:36
@karl-power karl-power force-pushed the karl/id-field-in-log-filter branch from e44d45a to 0f75ae4 Compare May 1, 2026 01:41
@karl-power
Contributor Author

@MikeShi42 updated to use block number/offset

  ORDER BY (ServiceName, TimestampTime, Timestamp)
  TTL TimestampTime + ${TABLES_TTL}
- SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1;
+ SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1, enable_block_number_column = 1, enable_block_offset_column = 1;
Member


This is great! Although, I wonder whether multiple replicas would return different block numbers. Maybe something to think about later.

@wrn14897 wrn14897 dismissed their stale review May 4, 2026 12:42

Resolved

Member

@wrn14897 wrn14897 left a comment


awesome fix!

@kodiakhq kodiakhq Bot merged commit 88b2b64 into main May 4, 2026
18 checks passed
@kodiakhq kodiakhq Bot deleted the karl/id-field-in-log-filter branch May 4, 2026 13:06

Labels

automerge review/tier-4 Critical — deep review + domain expert sign-off


Development

Successfully merging this pull request may close these issues.

‘expand log line’ not use all filter.

3 participants